Math Primer

Let $X$ and $Y$ be two jointly Gaussian random variables with zero mean, variances $\sigma_1^2$ and $\sigma_2^2$, and correlation coefficient $\rho$. Then $Z=XY$ has the probability density

$$p_Z(z) = \frac{1}{\pi \sigma_1 \sigma_2 \sqrt{1-\rho^2}} \exp\left(\frac{\rho z}{\sigma_1\sigma_2(1-\rho^2)}\right) K_0\left(\frac{|z|}{\sigma_1\sigma_2(1-\rho^2)}\right) $$

Source: https://ieeexplore.ieee.org/document/7579552
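
A minimal sketch of how this density could be evaluated numerically, assuming NumPy and SciPy are available. The names `product_pdf` and `pdf_moment` are illustrative helpers, not taken from the paper or from any pipeline code. The sketch uses the exponentially scaled Bessel function `scipy.special.k0e` so that the growing exponential and the decaying $K_0$ are combined before either can overflow or underflow.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import k0e


def product_pdf(z, sigma1=1.0, sigma2=1.0, rho=0.0):
    """Density p_Z(z) of Z = XY for zero-mean correlated Gaussians (hypothetical helper)."""
    z = np.asarray(z, dtype=float)
    scale = sigma1 * sigma2 * (1.0 - rho**2)
    norm = np.pi * sigma1 * sigma2 * np.sqrt(1.0 - rho**2)
    # k0e(u) = exp(u) * K_0(u), so exp(rho*z/scale) * K_0(|z|/scale)
    # equals exp((rho*z - |z|)/scale) * k0e(|z|/scale); the exponent is never positive.
    u = np.abs(z) / scale
    return np.exp(rho * z / scale - u) * k0e(u) / norm


def pdf_moment(order, sigma1=1.0, sigma2=1.0, rho=0.0, zmax=60.0):
    """Integrate z**order * p_Z(z), splitting at the integrable singularity z = 0.

    zmax is a truncation limit that assumes sigma1 * sigma2 is of order 1.
    """
    f = lambda z: z**order * product_pdf(z, sigma1, sigma2, rho)
    left, _ = quad(f, -zmax, 0.0, limit=200)
    right, _ = quad(f, 0.0, zmax, limit=200)
    return left + right


total = pdf_moment(0, rho=0.15)  # normalization, should be close to 1
mean = pdf_moment(1, rho=0.15)   # should be close to rho * sigma1 * sigma2
print(total, mean)
```

Splitting the integral at $z=0$ keeps the logarithmic singularity at an endpoint of each sub-interval, which adaptive quadrature handles better than an interior singularity.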

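$E(Z) = \rho \sigma_1 \sigma_2$, which follows from the definition of the correlation coefficient. Because the density peaks at 0 for any $\rho$, the nonzero mean comes from the skewness of the distribution: the exponential factor gives more weight to the tail whose sign matches that of $\rho$. The logarithmic divergence of $K_0$ near 0 means that the density has an integrable, cusp-like peak at $z=0$ rather than a smooth maximum.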

$Var(Z) = (1+\rho^2)\sigma_1^2\sigma_2^2$, which is smallest when $\rho=0$, i.e., when $X$ and $Y$ are independent.

Given a sample of $Z$, we can estimate $\rho$ and $\sigma_1\sigma_2$ either by measuring the sample mean and sample variance or by fitting the $p_Z(z)$ curve to a histogram.
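
The moment-based route can be sketched as follows. Since $E(Z) = \rho\sigma_1\sigma_2$ and $Var(Z) = (1+\rho^2)\sigma_1^2\sigma_2^2$, we have $Var(Z) - E(Z)^2 = \sigma_1^2\sigma_2^2$, so both parameters follow from the first two sample moments. The function name `estimate_from_moments` and the simulated sample are assumptions made for illustration only; they are not part of the pipeline.

```python
import numpy as np


def estimate_from_moments(z):
    """Return (rho_hat, sigma1*sigma2 estimate) from a sample of Z = XY."""
    z = np.asarray(z, dtype=float)
    mean = z.mean()
    var = z.var(ddof=1)
    # Var(Z) - E(Z)^2 = (sigma1 * sigma2)^2
    s_squared = var - mean**2
    if s_squared <= 0:
        raise ValueError("sample moments are inconsistent with the model")
    s = np.sqrt(s_squared)
    return mean / s, s


# Simulated check with known (hypothetical) input parameters.
rng = np.random.default_rng(12345)
rho_true, sigma1, sigma2 = 0.15, 1.0, 1.0
cov = [[sigma1**2, rho_true * sigma1 * sigma2],
       [rho_true * sigma1 * sigma2, sigma2**2]]
x, y = rng.multivariate_normal([0.0, 0.0], cov, size=200_000).T
rho_hat, s_hat = estimate_from_moments(x * y)
print(rho_hat, s_hat)
```

The sample variance of $Z$ converges slowly because $Z$ is heavy-tailed compared with a Gaussian, so the estimate of $\sigma_1\sigma_2$ is noisier than the sample size alone might suggest.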

If we have multiple realizations of $Z$, then the average over all noise realizations, $\overline{Z}$, is approximately Gaussian distributed by the Central Limit Theorem.

Set up tickets/DM-32700 branch of pipe_tasks

Choose between RC2 and DC2

Run one of the following two blocks depending on whether you want to look at HSC RC2 data or DESC DC2 data

If the noise in calexp is Gaussian (it is in our Monte Carlo simulations), then the noise in deepCoadd_warp and deepCoadd is also Gaussian, since warping and coaddition are linear operations.
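
A toy illustration of this point, assuming a one-dimensional two-tap linear interpolation as a stand-in for resampling. This is not the kernel used by the actual warping code, and `frac` is a hypothetical sub-pixel shift. The output stays Gaussian (excess kurtosis near 0) but neighboring pixels become correlated.

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(2021)
white = rng.standard_normal(1_000_001)  # uncorrelated Gaussian input noise

# Hypothetical sub-pixel shift; each output pixel mixes two adjacent input pixels.
frac = 0.1
warped = (1.0 - frac) * white[:-1] + frac * white[1:]

# A linear combination of Gaussians is Gaussian: excess kurtosis stays near 0.
excess_kurtosis = kurtosis(warped)

# Adjacent outputs share one input pixel, which correlates them.
rho_measured = np.corrcoef(warped[:-1], warped[1:])[0, 1]
rho_expected = frac * (1.0 - frac) / ((1.0 - frac) ** 2 + frac**2)
print(excess_kurtosis, rho_measured, rho_expected)
```

In this toy model the product of neighboring pixels would follow $p_Z(z)$ with the correlation coefficient given by `rho_expected`.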

Visualize the pure noise images

The set of collections in u/kannawad/DM-23253-noises/ contains images obtained by replacing each calexp with a white Gaussian noise image.

Only deepCoadd_warp and deepCoadd datasets are persisted. The calexp are pure noise images made in-situ and are not persisted.

Compute a covariance matrix PER PIXEL and average over noise realizations

It appears that $Z=XY$ has some sort of spatial pattern, though it could be my eyes tricking me. To check whether there is a persistent pattern, we do an ensemble average over multiple noise realizations. If the pattern is persistent, then $Var(\overline{Z}) \lesssim Var(Z)$. If the pattern is random, then $Var(\overline{Z}) \approx Var(Z)/N$ (the standard deviation falls as $1/\sqrt{N}$), where $N$ is the number of noise realizations.
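
A self-contained sketch of this test on simulated maps, not on the persisted coadds. The map size, the number of realizations, and the sinusoidal "persistent pattern" are all assumptions chosen for illustration. For independent realizations, the spatial variance of the ensemble-mean map drops by a factor of about $N$ relative to a single realization. When a fixed pattern is present in every realization, the variance cannot fall below that of the pattern.

```python
import numpy as np

rng = np.random.default_rng(7)
n_real, shape, rho = 64, (128, 128), 0.15


def correlated_product(rng, rho, shape):
    """One realization of Z = XY with unit-variance X, Y and correlation rho."""
    x = rng.standard_normal(shape)
    w = rng.standard_normal(shape)
    y = rho * x + np.sqrt(1.0 - rho**2) * w
    return x * y


# Stack of independent realizations, shape (n_real, ny, nx).
z_stack = np.stack([correlated_product(rng, rho, shape) for _ in range(n_real)])

# Random case: compare the spatial variance of one map with that of the mean map.
var_single = z_stack.var(axis=(1, 2)).mean()
var_mean = z_stack.mean(axis=0).var()
ratio = var_single / var_mean  # expected to be close to n_real

# Persistent case: the same (hypothetical) sinusoidal pattern in every realization.
pattern = 0.5 * np.sin(2.0 * np.pi * np.arange(shape[1]) / 32.0)[None, :] * np.ones(shape)
var_mean_pattern = (z_stack + pattern).mean(axis=0).var()
print(ratio, var_mean_pattern, pattern.var())
```

Comparing `ratio` with `n_real` gives a quick diagnostic: a ratio well below $N$ would suggest structure that survives the averaging.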

Now, do an ensemble averaging, i.e., average over all noise realizations

Let us look at the distribution of the $Z$ values for a handful of noise realizations, and then for the ensemble average.

If the coadd (warp) had uncorrelated noise, then the distribution should be centered around 0 and have a standard deviation of $1$. The distribution for individual noise realizations should be $p_Z(z)$ for some parameters, but should tend to a Gaussian for the ensemble mean due to the CLT.

The mean appears quite robust!

All measurements point to $\rho \approx 0.15$.

Let us check if it is similar for the real data

Visualize the noise regions (+undetected sources) in the image

Let us take a real image and look at it. If nothing else, this lets us make sure that we are not reading in the pure-noise coadds.

The presence of undetected sources will certainly lead to additional positive correlation of the noise.

Conclusions

It remains to be seen if we need a 'full' correlation matrix to calculate the uncertainties in fluxes and shapes more accurately.